Character-based PSMT for Closely Related Languages
Author
Abstract
Translating unknown words between related languages with a character-based statistical machine translation model can be beneficial. In this paper, we describe a simple method for combining character-based models with standard word-based models to increase the coverage of a phrase-based SMT system. With this approach, we obtain a modest improvement when translating between Norwegian and Swedish. The potential of character-based models for closely related languages is also illustrated by applying the character model on its own: its performance is similar to the word-level baseline, and its output is closer to the reference in terms of string similarity.
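As a rough illustration of the ideas in the abstract, the sketch below shows how a sentence might be rewritten so that each character becomes a token, how unknown words could be routed to a character-level model, and how string similarity to a reference could be measured. Everything here is an assumption made for illustration: the `_` boundary marker, the helper names, the backoff rule, and the use of `difflib` as the similarity measure are not taken from the paper, whose actual combination method is not described in the abstract.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List, Set

# Assumed placeholder for word boundaries; input text must not already contain it.
SPACE = "_"


def to_char_tokens(sentence: str) -> str:
    """Rewrite a sentence so that every character is a space-separated token."""
    words = sentence.split()
    return " ".join(SPACE.join(words))


def from_char_tokens(char_sentence: str) -> str:
    """Invert to_char_tokens: drop token separators, restore word boundaries."""
    return "".join(char_sentence.split()).replace(SPACE, " ")


def translate_with_backoff(
    words: Iterable[str],
    known_vocab: Set[str],
    word_translate: Callable[[str], str],
    char_translate: Callable[[str], str],
) -> List[str]:
    """Hypothetical backoff: known words go to the word-level model,
    unknown words go to the character-level model."""
    output = []
    for word in words:
        if word in known_vocab:
            output.append(word_translate(word))
        else:
            output.append(from_char_tokens(char_translate(to_char_tokens(word))))
    return output


def string_similarity(hypothesis: str, reference: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for the paper's metric."""
    return SequenceMatcher(None, hypothesis, reference).ratio()
```

In this sketch, `word_translate` and `char_translate` stand for trained models and would be supplied by the caller; a toy dictionary lookup and an identity function are enough to exercise the routing logic.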
Similar resources
Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages
We propose several techniques for improving statistical machine translation between closely-related languages with scarce resources. We use character-level translation trained on n-gram-character-aligned bitexts and tuned using word-level BLEU, which we further augment with character-based transliteration at the word level and combine with a word-level translation model. The evaluation on Maced...
Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation
Word reordering is one of the most difficult aspects of Statistical Machine Translation (SMT), and an important factor of its quality and efficiency. While short and medium-range reordering is reasonably handled by the phrase-based approach (PSMT), long-range reordering still represents a challenge for state-of-the-art PSMT systems. As a major cause of this problem, we point out the inadequacy o...
Character-Based Pivot Translation for Under-Resourced Languages and Domains
In this paper we investigate the use of character-level translation models to support the translation from and to under-resourced languages and textual domains via closely related pivot languages. Our experiments show that these low-level models can be successful even with tiny amounts of training data. We test the approach on movie subtitles for three language pairs and legal texts for another ...
On The Communication Complexity of Perfectly Secure Message Transmission in Directed Networks
In this paper, we revisit the problem of perfectly secure message transmission (PSMT) in a directed network in the presence of a threshold adaptive Byzantine adversary with unbounded computing power. Desmedt et al. [5] gave a characterization of PSMT protocols with three or more phases over directed networks. Recently, Patra et al. [15] gave a characterization of two-phase PSM...
Efficient Perfectly Reliable and Secure Communication Tolerating Mobile Adversary
We study the problem of Perfectly Reliable Message Transmission (PRMT) and Perfectly Secure Message Transmission (PSMT) between two nodes S and R in an undirected synchronous network, a part of which is under the influence of an all-powerful mobile Byzantine adversary. At ACISP 2007, Srinathan et al. proved that the connectivity requirement for PSMT protocols is the same for both static and mob...